Copilot AI commented Dec 27, 2025

Description

This PR fixes the issue where the Python driver incorrectly retries TRUNCATE_ERROR (0x1003), which should not be retried as it indicates a permanent failure rather than a transient error.

Problem

  1. TruncateError was grouped with transient errors (OverloadedErrorMessage, IsBootstrappingErrorMessage, ServerError) that trigger retry_policy.on_request_error()
  2. The default retry policy returns RETRY_NEXT_HOST, causing unnecessary retries on other hosts
  3. When a table is dropped during truncate, users would see a confusing "unconfigured table" error from validation on the second retry instead of the original truncate error
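
The masking effect described in the Problem list can be reproduced with a small, self-contained simulation. This is only a sketch: `FakeHost`, `ServerSideError`, both runner functions, and the first host's message text are hypothetical stand-ins, not driver code or real server output.

```python
class ServerSideError(Exception):
    """Hypothetical stand-in for an error response returned by a node."""


class FakeHost:
    """Hypothetical host that always fails with a fixed error."""

    def __init__(self, name, error):
        self.name = name
        self.error = error

    def execute(self, statement):
        raise self.error


def run_with_retry_next_host(hosts, statement):
    """Old behavior (simplified): on error, try the next host and
    surface whichever error was seen last."""
    last_error = None
    for host in hosts:
        try:
            return host.execute(statement)
        except ServerSideError as exc:
            last_error = exc
    raise last_error


def run_without_retry(hosts, statement):
    """New behavior (simplified): the first error is surfaced as-is."""
    return hosts[0].execute(statement)


def surfaced_error(runner, hosts, statement="TRUNCATE ks.t"):
    """Return the text of the error the caller would end up seeing."""
    try:
        runner(hosts, statement)
    except ServerSideError as exc:
        return str(exc)
    return None


# The first message is made up for illustration; the second mirrors the
# "unconfigured table" error quoted in the issue.
hosts = [
    FakeHost("node1", ServerSideError("Error during truncate: table is being dropped")),
    FakeHost("node2", ServerSideError("unconfigured table ks.t")),
]

print(surfaced_error(run_with_retry_next_host, hosts))
print(surfaced_error(run_without_retry, hosts))
```

With the retrying runner the caller only sees the second host's validation error; without the retry, the original truncate error is preserved.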

Solution

  • Modified error handling to keep TruncateError in the error tuple with similar request errors for better code organization
  • Added a specific check within the error handling block to bypass retry logic for TruncateError
  • TruncateError now immediately rethrows without retry, sharing logging and metrics code with other request errors
  • Updated integration tests to verify TruncateError is not retried
  • Added comprehensive unit tests to document expected behavior
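
The control flow described in the Solution list can be sketched roughly as follows. This is a simplified illustration, not the actual code in cassandra/cluster.py: the error classes, `StubRetryPolicy`, `StubMetrics`, the string constants, and `decide` are stand-ins defined here, and the real handler's ordering and logging may differ.

```python
# Hypothetical stand-ins for the driver's error types.
class ErrorMessage(Exception):
    pass


class OverloadedErrorMessage(ErrorMessage):
    pass


class TruncateError(ErrorMessage):
    pass


# Hypothetical decision constants, modeled on retry-policy decisions.
RETRY_NEXT_HOST = "retry_next_host"
RETHROW = "rethrow"


class StubRetryPolicy:
    """Counts calls so the bypass can be observed."""

    def __init__(self):
        self.calls = 0

    def on_request_error(self, query, consistency, error, retry_num):
        self.calls += 1
        return RETRY_NEXT_HOST, None


class StubMetrics:
    def __init__(self):
        self.other_errors = 0

    def on_other_error(self):
        self.other_errors += 1


def decide(response, retry_policy, metrics=None):
    """Return a (decision, consistency) pair for an error response."""
    # TruncateError stays in the same tuple as the other request errors...
    if isinstance(response, (OverloadedErrorMessage, TruncateError)):
        # ...so metrics (and, in the real code, logging) are shared.
        if metrics is not None:
            metrics.on_other_error()
        # Specific check: a truncate failure never consults the retry policy.
        if isinstance(response, TruncateError):
            return RETHROW, None
        return retry_policy.on_request_error(None, None, response, retry_num=0)
    return RETHROW, None


policy = StubRetryPolicy()
metrics = StubMetrics()
truncate_decision = decide(TruncateError("truncate failed"), policy, metrics)
calls_after_truncate = policy.calls
overloaded_decision = decide(OverloadedErrorMessage("busy"), policy, metrics)
```

Keeping the type in the tuple and branching inside the block means the metrics call appears once, which is the duplication the refactoring aimed to remove.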

Note: Investigation confirmed that server error messages are already properly included in TruncateError exceptions via the base ErrorMessage class's __str__ method, so no changes were needed for that aspect.
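
As an illustration of the note above, a subclass inherits message formatting when the base class implements `__str__`. The classes below are simplified stand-ins and the output format is an assumption chosen for this sketch, not the driver's actual format; only the 0x1003 code and the 'Error during truncate' summary come from the issue.

```python
class ErrorMessage(Exception):
    """Hypothetical base class: formats the summary, code, and server message."""

    summary = "Unknown"
    error_code = None

    def __init__(self, message):
        super().__init__(message)
        self.message = message

    def __str__(self):
        # Assumed format for this sketch only.
        return "%s (code=0x%04X): %s" % (self.summary, self.error_code, self.message)


class TruncateError(ErrorMessage):
    summary = "Error during truncate"
    error_code = 0x1003


# The message text is made up for illustration.
err = TruncateError("table ks.t was dropped")
text = str(err)
print(text)
```

Because `TruncateError` does not override `__str__`, the server-provided message reaches the caller alongside the generic summary.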

Changes Made

  • cassandra/cluster.py: Refactored TruncateError handling for better readability while preventing retry logic
    • TruncateError remains in the error tuple (line 4779) to show its relationship to other request errors
    • Added specific check (lines 4784-4786) to bypass retry for TruncateError
    • Reduced code duplication by sharing logging and metrics handling
  • tests/integration/simulacron/test_policies.py: Updated tests to verify no retry behavior for TruncateError
  • tests/unit/test_truncate_error.py: Added unit tests for TruncateError message handling and error code verification

Testing

  • ✅ Unit tests pass for TruncateError behavior
  • ✅ Exception handling tests pass
  • ✅ CodeQL security analysis passed with no vulnerabilities
  • ✅ Manual verification confirms TruncateError no longer triggers retries
  • ✅ Type checking logic verified to work correctly after refactoring

Pre-review checklist

  • I have split my patch into logically separate commits.
  • All commit messages clearly explain what they change and why.
  • I added relevant tests for new features and bug fixes.
  • All commits compile, pass static checks and pass tests.
  • PR description sums up the changes and reasons why they should be introduced.
  • I have provided docstrings for the public items that I want to introduce.
  • I have adjusted the documentation in ./docs/source/.
Original prompt

This section details the original issue you should resolve

<issue_title>Driver retries TRUNCATE_ERROR (0x1003), but shouldn't</issue_title>
<issue_description>The truncate_statement class converts any error that occurs during TRUNCATE TABLE execution into TRUNCATE_ERROR. The driver calls retry_policy.on_request_error for TruncateError, which returns RETRY_NEXT_HOST; retrying makes little sense here. For example, when a table is dropped during a truncate, the user sees "unconfigured table <table name>" (raised by the validation triggered on the second retry) instead of the original error.

Another problem is that the TruncateError Python class ignores the error message sent by the server: it always returns the unhelpful 'Error during truncate'.</issue_description>

Comments on the Issue (you are @copilot in this section)


mykaul commented Jan 7, 2026

@copilot - please complete the work here.

Copilot AI and others added 3 commits January 7, 2026 17:04
Co-authored-by: mykaul <4655593+mykaul@users.noreply.github.com>
Co-authored-by: mykaul <4655593+mykaul@users.noreply.github.com>
… test comment

Co-authored-by: mykaul <4655593+mykaul@users.noreply.github.com>

Copilot AI commented Jan 7, 2026

@copilot - please complete the work here.

Work completed! The issue has been fixed in commit 9222b2e.

Summary of changes:

  1. Fixed retry behavior: TruncateError is no longer retried on other hosts, preventing users from seeing confusing secondary errors instead of the original truncate error
  2. Added test coverage: Updated integration tests and added unit tests to verify the new behavior
  3. Verified message handling: Confirmed that server error messages are properly included in TruncateError exceptions

The driver now correctly handles TruncateError as a permanent failure that should not be retried.

Copilot AI changed the title from "[WIP] Fix driver behavior on TruncateError handling" to "Fix driver behavior on TruncateError handling - stop retrying on permanent failures" Jan 7, 2026

if self._metrics is not None:
    self._metrics.on_other_error()
cl = getattr(self.message, 'consistency_level', None)
retry = retry_policy.on_request_error(

@copilot - the code would be more readable if you kept TruncateError in the tuple on line 4779 and just added an if isinstance(response, TruncateError): check here, followed by lines 4792-4793. Please fix.

Refactored for better readability in commit 77d5c57. TruncateError is now kept in the error tuple on line 4779, with a specific check inside the block to handle it differently (no retry). This reduces duplication and makes the code flow clearer.

Keep TruncateError in the error tuple but check for it specifically
within the block to avoid retry logic. This makes it clearer that
TruncateError is similar to other request errors but requires
different handling.

Co-authored-by: mykaul <4655593+mykaul@users.noreply.github.com>
Copilot AI requested a review from mykaul January 9, 2026 18:57
Development

Successfully merging this pull request may close these issues.

Driver retries TRUNCATE_ERROR (0x1003), but shouldn't

2 participants